You are viewing the RapidMiner Studio documentation for version 10.1.

Remove Document Parts (Text Processing)

Synopsis

Removes parts of a single document.

Description

This operator removes all parts of a document that match a given regular expression. This can be helpful, for example, for deleting tags from HTML. A regular expression matching all tags would be <[^>]*>
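
The effect of such an expression can be illustrated outside RapidMiner. The following is a minimal sketch, assuming Python's re module rather than the regular expression engine the operator itself uses; the function name remove_document_parts and the sample text are hypothetical and only mirror the behavior described above.

```python
import re


def remove_document_parts(text: str, deletion_regex: str) -> str:
    """Hypothetical stand-in for the operator: delete every match of deletion_regex."""
    return re.sub(deletion_regex, "", text)


# Tag-matching expression from the description:
# "<", then any run of characters other than ">", then ">".
TAG_REGEX = r"<[^>]*>"

html = "<p>Hello, <b>world</b>!</p>"
cleaned = remove_document_parts(html, TAG_REGEX)
print(cleaned)  # Hello, world!
```

Because [^>]* stops at the first closing bracket, each tag is matched on its own, so the text between tags is kept. A lone "<" with no following ">" is not matched and stays in the text.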

Input

  • document

    The document port.

Output

  • document

    The document port.

Parameters

  • deletion_regex

    This regular expression specifies the parts of the string that are deleted. Range: